Setup

Loading libraries and data.

New Members: S1 and S2

Data Source: Raw Data Discord Server Insights. Download the CSV file to the following repo path: analytics > s3_data_activity > discord > raw_data

Procedure Summary: For the Season 1 vs Season 2 comparison, import the two CSV datasets into dataframes, combine them with rbind, and visualize using the ggplot theme() below. Include a vertical line separating the two seasons.

Description of Findings: Shows organic growth in Discord discovery.
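The read -> rbind -> plot procedure above can be sketched as follows. This is a minimal sketch assuming tidyverse is loaded; the file names are hypothetical stand-ins for the raw_data CSVs, the season boundary date is taken from the seasonal date filters later in this document, and theme_minimal() stands in for the shared theme() customization.

```r
library(tidyverse)

# Hypothetical file names -- substitute the actual raw_data CSVs
s1 <- read_csv("raw_data/s1_new_members.csv")
s2 <- read_csv("raw_data/s2_new_members.csv")

# Both files share the same column specification, so rbind stacks them cleanly
combined <- rbind(s1, s2)

ggplot(combined, aes(x = interval_start_timestamp, y = discovery_joins)) +
  geom_line() +
  # Vertical line separating Season 1 from Season 2 (assumed boundary date)
  geom_vline(xintercept = as.POSIXct("2021-10-08"), linetype = "dashed") +
  theme_minimal()  # stand-in for the shared theme() used throughout
```

The same pattern applies to every S1 vs S2 section below; only the file names and the y aesthetic change.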

## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   discovery_joins = col_double(),
##   invites = col_double(),
##   vanity_joins = col_double()
## )
## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   discovery_joins = col_double(),
##   invites = col_double(),
##   vanity_joins = col_double()
## )

Total Members: S1 and S2

Data Source: Raw Data Discord Server Insights. Download the CSV file to the following repo path: analytics > s3_data_activity > discord > raw_data

Procedure Summary: For the Season 1 vs Season 2 comparison, import the two CSV datasets into dataframes, combine them with rbind, and visualize using the ggplot theme() below. Include a vertical line separating the two seasons.

Description of Findings: Total (cumulative) Discord member numbers are going up and to the right, with a vertical line separating the seasons.

## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   total_membership = col_double()
## )
## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   total_membership = col_double()
## )

First Day Activation: S1 and S2

Data Source: Raw Data Discord Server Insights. Download the CSV file to the following repo path: analytics > s3_data_activity > discord > raw_data

Procedure Summary: For the Season 1 vs Season 2 comparison, import the two CSV datasets into dataframes, combine them with rbind, and visualize using the ggplot theme() below. Include a vertical line separating the two seasons.

Description of Findings: Voice participation went up in Season 2.

## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   new_members = col_double(),
##   pct_communicated = col_double(),
##   pct_opened_channels = col_double()
## )
## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   new_members = col_double(),
##   pct_communicated = col_double(),
##   pct_opened_channels = col_double()
## )

Total Retention: S1 and S2

Data Source: Raw Data Discord Server Insights. Download the CSV file to the following repo path: analytics > s3_data_activity > discord > raw_data

Procedure Summary: For the Season 1 vs Season 2 comparison, import the two CSV datasets into dataframes, combine them with rbind, and visualize using the ggplot theme() below. Include a vertical line separating the two seasons.

Description of Findings: Retention has not improved much, but we have more new members.

## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   new_members = col_double(),
##   pct_retained = col_double()
## )
## Warning: 2 parsing failures.
## row col  expected    actual                                                                  file
##  15  -- 3 columns 2 columns '../s2_data_activity/discord/raw_data/next-week-retention_290921.csv'
##  16  -- 3 columns 2 columns '../s2_data_activity/discord/raw_data/next-week-retention_290921.csv'
## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   new_members = col_double(),
##   pct_retained = col_double()
## )
## Warning: 2 parsing failures.
## row col  expected    actual                                            file
##  15  -- 3 columns 2 columns './discord/raw_data/guild-retention_010122.csv'
##  16  -- 3 columns 2 columns './discord/raw_data/guild-retention_010122.csv'
## Warning: Removed 2 row(s) containing missing values (geom_path).
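The parsing failures above come from trailing rows that had only 2 of the expected 3 columns; those rows parse with NAs, which is what geom_path later drops. One way to inspect and remove them explicitly before plotting (a sketch, assuming readr and dplyr; path taken from the warning above):

```r
library(readr)
library(dplyr)

# Path from the parsing-failure warning above
ret <- read_csv("../s2_data_activity/discord/raw_data/next-week-retention_290921.csv")

# readr records each failure; this lists rows 15-16 (2 columns found, 3 expected)
problems(ret)

# Drop the incomplete rows explicitly rather than relying on ggplot's warning
ret_clean <- ret %>% filter(!is.na(pct_retained))
```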

Total Visitor and Communicator: S1 and S2

Data Source: Raw Data Discord Server Insights. Download the CSV file to the following repo path: analytics > s3_data_activity > discord > raw_data

Procedure Summary: For the Season 1 vs Season 2 comparison, import the two CSV datasets into dataframes, combine them with rbind, and visualize using the ggplot theme() below. Include a vertical line separating the two seasons.

Description of Finding: Still below benchmark, but trending slightly upward in Season 2.

## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   visitors = col_double(),
##   pct_communicated = col_double()
## )
## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   visitors = col_double(),
##   pct_communicated = col_double()
## )

Total Message - Avg Message: S1 and S2

Data Source: Raw Data Discord Server Insights. Download the CSV file to the following repo path: analytics > s3_data_activity > discord > raw_data

Procedure Summary: For the Season 1 vs Season 2 comparison, import the two CSV datasets into dataframes, combine them with rbind, and visualize using the ggplot theme() below. Include a vertical line separating the two seasons.

Description of Finding: Average messages per person are trending down, but total messages are higher in S2 relative to S1.

## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   messages = col_double(),
##   messages_per_communicator = col_double()
## )
## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   messages = col_double(),
##   messages_per_communicator = col_double()
## )

Total Voice Activity: S1 and S2

Data Source: Raw Data Discord Server Insights. Download the CSV file to the following repo path: analytics > s3_data_activity > discord > raw_data

Procedure Summary: For the Season 1 vs Season 2 comparison, import the two CSV datasets into dataframes, combine them with rbind, and visualize using the ggplot theme() below. Include a vertical line separating the two seasons.

Description of Finding: Voice activity clearly went up in Season 2.

## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   speaking_minutes = col_double()
## )
## Parsed with column specification:
## cols(
##   interval_start_timestamp = col_datetime(format = ""),
##   speaking_minutes = col_double()
## )

POAP Distribution

Data Source: McKethanor built a pipeline to the POAP Subgraph (with Elixir), got the data, and pushed it to the DAODash Postgres database as bankless_poaps_w_count, with the columns id, name, and mint_count. You can filter for “bankless” and “community” to get events for Community Calls.

NOTE: There is missing and/or incomplete data here.

Procedure Summary: Query DAODash and save the raw data in the Analytics repo: analytics > s3_data_activity > poap_cc_claims > raw_data.

Description of Finding: The average number of POAPs claimed was 234 (up through Community Call #34). A lot of data is still missing.
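The coercion warnings in the output below come from mint_count being parsed as character. A sketch of the transformation, matching the mutate() input named in the warnings (the file name is an assumption):

```r
library(tidyverse)

poaps <- read_csv("raw_data/bankless_poaps_w_count.csv")  # hypothetical file name

cc_poaps <- poaps %>%
  # Filter for Community Call events, as described above
  filter(str_detect(tolower(name), "bankless"),
         str_detect(tolower(name), "community")) %>%
  # Non-numeric mint_count values become NA ("NAs introduced by coercion")
  mutate(mint_count_2 = as.numeric(mint_count))

cc_poaps %>%
  summarise(total = sum(mint_count_2, na.rm = TRUE),
            avg   = mean(mint_count_2, na.rm = TRUE))
```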

## Parsed with column specification:
## cols(
##   id = col_double(),
##   name = col_character(),
##   mint_count = col_character()
## )
## Warning: Problem with `mutate()` input `mint_count_2`.
## x NAs introduced by coercion
## ℹ Input `mint_count_2` is `as.numeric(mint_count)`.
## Warning in mask$eval_all_mutate(dots[[i]]): NAs introduced by coercion
## # A tibble: 1 x 1
##   total
##   <dbl>
## 1  7264
## Warning: Problem with `mutate()` input `mint_count_2`.
## x NAs introduced by coercion
## ℹ Input `mint_count_2` is `as.numeric(mint_count)`.

## Warning: NAs introduced by coercion
## Warning: Removed 4 rows containing missing values (position_stack).

Alternative source: Content Gateway API: https://prod-content-gateway-api.herokuapp.com/api/v1/graphql Query historical -> PoapV1 -> PoapTokenV1s

You can query all events containing “Bankless DAO Community” (for Community Call POAPs), but we still need to find the number of POAPs claimed at each event.

NOTE: Could also use the POAP.xyz subgraph API hosted on The Graph.

Snapshot Votes across Season 0 - 2

Data Source: Update the two Snapshot-related pipelines, snapshot_votes_sched.py and snapshot_proposal_sched.py, in repo: analytics > graphql > exploratory.

The pipelines feed into the following tables in the Postgres db:

  • stg_bankless_snapshot_1
  • bankless_snapshot_header_1

This is the query in DAODash:

#SELECT
#  voter,
#  choice,
#  created,
#  vote_id,
#  title,
#  s.proposal_id 
#FROM stg_bankless_snapshot_1 s 
#LEFT JOIN bankless_snapshot_header_1 h ON s.proposal_id = h.proposal_id

Save the download as a CSV to this repo path: s3_data_activity > snapshot > raw_data > total_snapshot_votes_s0_s2

Procedure Summary: The updated pipeline and query for total_snapshot_votes_s0_s2 provides the raw data for visualization. Use the updated pipeline instead of trying to glue together separate data sets from S1 vs S2.

Description of Finding: The average number of votes per Snapshot proposal is 484 (11,136 total votes across 23 proposals). 10 of 23 proposals exceeded that average.

## Parsed with column specification:
## cols(
##   voter = col_character(),
##   choice = col_double(),
##   created = col_double(),
##   vote_id = col_character(),
##   title = col_character(),
##   proposal_id = col_character()
## )
## # A tibble: 1 x 1
##     avg
##   <int>
## 1 11136
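The 484 figure in the finding above is total votes divided by proposals (11,136 / 23 ≈ 484). A sketch of the per-proposal computation, assuming the CSV saved in the step above:

```r
library(tidyverse)

votes <- read_csv("raw_data/total_snapshot_votes_s0_s2.csv")

# One row per proposal, with its vote count
per_proposal <- votes %>% count(proposal_id, title, name = "votes")

mean(per_proposal$votes)                            # ~484 per the finding above
sum(per_proposal$votes > mean(per_proposal$votes))  # proposals above the average
```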

Grants Committee Fund Distribution:

# --Season 1 (116 rows): timestamp_display < '2021-10-08'::DATE AND timestamp_display > '2021-06-14'::DATE  -- TOTAL: 11,288,361.28706
# --Season 2 (426 rows): timestamp_display > '2021-10-07'::DATE AND timestamp_display < '2022-01-08'::DATE  -- TOTAL: 10,780,178.00001

#WITH fund_flow AS (
#select 
#  'Out' as Direction,
#   ssb.from_address,
#   w.human_readable,
#   w.entity_type,
#   ssb.amount_display,
#   ssb.timestamp_display,
#   ssb.to_address
#from stg_subgraph_bank_1 ssb
#join public.bankless_wallet_entity_2 w on lower(ssb.from_address) = lower(w.wallet_address)
#-- note 'Out' is from_address
   
#union all 
   
#select 
#  'In' as Direction,
#  ssb.from_address,
#  w.human_readable,
#  w.entity_type,
#  ssb.amount_display,
#  ssb.timestamp_display,
#  ssb.to_address
#from stg_subgraph_bank_1 ssb
#join public.bankless_wallet_entity_2 w on lower(ssb.to_address) = lower(w.wallet_address)
#-- note 'In' is to_address
#),

#gcs1 AS (
#SELECT
#  direction,
#  from_address,
#  human_readable AS readable,
#  entity_type,
#  amount_display,
#  timestamp_display,
#  to_address
#FROM fund_flow 
#WHERE entity_type = 'Grants Committee'
#AND timestamp_display > '2021-10-07'::DATE  
#AND timestamp_display < '2022-01-08'::DATE
#AND direction = 'Out'
#--where timestamp_display > '2021-10-01T09:26:59'::TIMESTAMP 
#ORDER BY timestamp_display DESC
#)

#SELECT
#  g.direction,
#  g.from_address,
#  g.readable,
#  g.entity_type,
#  g.amount_display,
#  g.timestamp_display,
#  g.to_address,
#  b.human_readable 
#FROM gcs1 g 
#LEFT JOIN bankless_wallet_entity_2 b ON lower(g.to_address) = lower(b.wallet_address)

Visualization

## 
## Attaching package: 'scales'
## The following object is masked from 'package:purrr':
## 
##     discard
## The following object is masked from 'package:readr':
## 
##     col_factor
## Parsed with column specification:
## cols(
##   direction = col_character(),
##   from_address = col_character(),
##   readable = col_character(),
##   entity_type = col_character(),
##   amount_display = col_double(),
##   timestamp_display = col_datetime(format = ""),
##   to_address = col_character(),
##   human_readable = col_character()
## )
## Parsed with column specification:
## cols(
##   direction = col_character(),
##   from_address = col_character(),
##   readable = col_character(),
##   entity_type = col_character(),
##   amount_display = col_double(),
##   timestamp_display = col_datetime(format = ""),
##   to_address = col_character(),
##   human_readable = col_character()
## )
## `summarise()` regrouping output by 'human_readable' (override with `.groups` argument)

Bankless Vault Fund Distribution (+ Grants Committee)

Data Source:

  • Pipeline: Need to update the pipeline to the BANK subgraph in repo: analytics > graphql > exploratory > see bank_sched.py or mck_bank_subgraph5.py.

  • Bankless_wallet_entity: Need to consult this sheet to add additional project, guild multisig & address to bankless_wallet_entity_2 table in DAODash.

  • Postgres DB tables: stg_subgraph_bank_1 and bankless_wallet_entity_2

  • Saved Queries: Fund_Flow_Bankless_Vault and Fund_Flow_GC_2.

  • Raw data: 4 data sets needed to create full_grants which is used in the side-by-side bar chart and subsequent sankeys:

  • gcs1a: s1_fund_flow_gc_2_20220112T062448 (see previous step: Grants Committee)
  • gcs2a: s2_fund_flow_gc_2_20220112T063018 (see previous step: Grants Committee)
  • bv_s1a: s1_fund_flow_bankless_vault_20220113T082455
  • bv_s2a: s2_fund_flow_bankless_vault_20220113T082831

Procedure Summary: Borrowing a portion of RTJ’s query to create a limited sankey showing the flow from Grants Committee (and Bankless Vault address 0xf26d1bb347a59f6c283c53156519cc1b1abaca51) into various guilds, projects and initiatives.

Outstanding Question: Why is direction ‘in’ for Bankless Vault (in Fund_Flow_Bankless_Vault), but ‘out’ for Grants Committee (in Fund_Flow_GC_2)?

Description of Finding: In Season 1, the Analytics Guild received funds from the Grants Committee address; in Season 2, it received funds from the Bankless Vault address (0xf26d1bb347a59f6c283c53156519cc1b1abaca51). We therefore need to combine the two CSV datasets to see the full picture of the Season 1 vs Season 2 comparison.

See DaoDash: Fund_Flow_Bankless_Vault (nearly identical to Fund_Flow_GC_2, except direction is ‘In’).

## Parsed with column specification:
## cols(
##   direction = col_character(),
##   from_address = col_character(),
##   readable = col_character(),
##   entity_type = col_character(),
##   amount_display = col_double(),
##   timestamp_display = col_datetime(format = ""),
##   to_address = col_character(),
##   human_readable = col_character()
## )
## Parsed with column specification:
## cols(
##   direction = col_character(),
##   from_address = col_character(),
##   readable = col_character(),
##   entity_type = col_character(),
##   amount_display = col_double(),
##   timestamp_display = col_datetime(format = ""),
##   to_address = col_character(),
##   human_readable = col_character()
## )
## `summarise()` regrouping output by 'human_readable' (override with `.groups` argument)

## `summarise()` regrouping output by 'human_readable' (override with `.groups` argument)

## `summarise()` regrouping output by 'human_readable' (override with `.groups` argument)

Example Sankey

REVISED Sankey: Grants Committee Vault, Season 1

Data Source: See above with Bankless Vault Fund Distribution (+ Grants Committee) data sources.

Procedure Summary: Use full_grants data frame to create sankey for Season 1 and Season 2.

NOTE: full_grants <- rbind(gcs1a, gcs2a, bv_s1a, bv_s2a). See previous section.

  • full_grants combines Grants Committee + Bankless Vault, across S1 and S2
  • start with full_grants

Description of Finding:

  • 26 projects/guilds in Season 1
  • 33 projects/guilds in Season 2

Customize colors and see “pink nodes” for new projects in Season 2.
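A sketch of building the Season 1 sankey from full_grants with networkD3 (the “Links is a tbl_df. Converting to a plain data frame.” line in the output comes from passing tibbles to sankeyNetwork). Column names follow the parsed specification above; the season boundary date is an assumption:

```r
library(tidyverse)
library(networkD3)

# Aggregate flows: source entity -> receiving project/guild, weighted by amount
links <- full_grants %>%
  filter(timestamp_display < as.POSIXct("2021-10-08")) %>%  # Season 1 only (assumed boundary)
  group_by(source_name = readable, target_name = human_readable) %>%
  summarise(value = sum(amount_display), .groups = "drop")

nodes <- tibble(name = unique(c(links$source_name, links$target_name)))

# networkD3 expects zero-indexed node references
links <- links %>%
  mutate(source = match(source_name, nodes$name) - 1,
         target = match(target_name, nodes$name) - 1)

sankeyNetwork(Links = links, Nodes = nodes,
              Source = "source", Target = "target",
              Value = "value", NodeID = "name")
```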

## # A tibble: 27 x 1
##    name                          
##    <chr>                         
##  1 BountyBoard                   
##  2 A/V Guild                     
##  3 Bankless Academy              
##  4 Analytics Guild               
##  5 Degen                         
##  6 Bankless Loans                
##  7 Writers Guild                 
##  8 Devs Guild Multisig           
##  9 Translators Guild             
## 10 Treasury Guild                
## 11 Ops Guild                     
## 12 Design Guild                  
## 13 Research Guild                
## 14 Marketing Guild               
## 15 First Quest                   
## 16 Bankless Website Multisig     
## 17 Liquity Project Multisig      
## 18 S1 Coordinape                 
## 19 BED Index                     
## 20 BizDev Guild                  
## 21 DAO Punks                     
## 22 Crypto Sapiens                
## 23 Legal Guild                   
## 24 Education Guild Multisig      
## 25 DevOps Infrastructure Multisig
## 26 Balancer Multisig             
## 27 Misc
## Links is a tbl_df. Converting to a plain data frame.

REVISED Sankey: Grants Committee Vault, Season 2

Data Source:

Procedure Summary:

Description of Findings:

## # A tibble: 34 x 1
##    name                                      
##    <chr>                                     
##  1 BountyBoard                               
##  2 A/V Guild                                 
##  3 DaoDash                                   
##  4 FightClub                                 
##  5 Content Gateway                           
##  6 Writers Guild                             
##  7 Marketing Guild                           
##  8 S1 Coordinape                             
##  9 DAO Punks                                 
## 10 Legal Guild                               
## 11 Balancer Liquidity Mining Program Multisig
## 12 International Media Node Multisig         
## 13 Book Club Multisig                        
## 14 Podcast Hatchery Multisig                 
## 15 Flipper Zone Multisig                     
## 16 Misc                                      
## 17 Bankless Academy                          
## 18 Analytics Guild                           
## 19 Degen                                     
## 20 Bankless Loans                            
## 21 NewsletterTeam                            
## 22 Devs Guild Multisig                       
## 23 Translators Guild                         
## 24 Treasury Guild                            
## 25 Ops Guild                                 
## 26 Design Guild                              
## 27 Research Guild                            
## 28 First Quest                               
## 29 Bankless Website Multisig                 
## 30 Liquity Project Multisig                  
## 31 BED Index                                 
## 32 BizDev Guild                              
## 33 Education Guild Multisig                  
## 34 DevOps Infrastructure Multisig
## Links is a tbl_df. Converting to a plain data frame.

Bounty Board

Data Source: Bounty Board team production database in Mongo. Because I was doing summary statistics comparing Season 1 and Season 2, I queried the database for metrics at a point in time, like so:

Query Database: query from database in prod. WIP.

Number of bounties: S1: 40, S2: 66+

Value committed to bounties: S1: 68K, S2: 209K

Update: numbers as of 15/01/22.

Then I manually created a dataframe.
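That manually created dataframe can be sketched from the numbers above (S2 values are the 15/01/22 snapshot and are lower bounds; units follow the 68K / 209K figures in the text):

```r
library(tibble)

bounties <- tibble(
  season          = c("S1", "S2"),
  n_bounties      = c(40, 66),        # S2 is "66+"
  value_committed = c(68000, 209000)  # 68K and 209K per the text above
)
```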

Procedure Summary:

Description of Finding: The number of bounties and the value committed to bounties both increased from S1 to S2.

Task: Query total number of bounties and total value committed to bounties between S1 v S2.

Coordinape Distribution

Data Source:

  • Pipeline: see data_request.py and coordinape_rounds.py in repo: analytics > coordinape
  • DAODash table: coordinape_rounds_3
  • DaoDash query: Seasonal Coordinape Distribution
  • Raw data: see s3_data_activity > coordinape > raw_data > s1_coordinape_distribution and s2_coordinape_distribution

Procedure Summary: Query from DAO Dash, save 2 CSV files. Run data transformations in R, but ultimately manually create dataframe with important summary data.

Description of Finding: Query comparison of tokens/bank distributed via Coordinape in S1 vs S2. Coordinape tokens, givers and receivers are all up.
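The seasonal summaries in the output below (total tokens, distinct givers and receivers) can be reproduced with a small helper. A sketch, assuming the two raw_data CSVs above:

```r
library(tidyverse)

s1 <- read_csv("raw_data/s1_coordinape_distribution.csv")
s2 <- read_csv("raw_data/s2_coordinape_distribution.csv")

# Total tokens distributed plus distinct givers and receivers for one season
summarise_season <- function(df) {
  df %>% summarise(sum_tokens = sum(tokens),
                   givers     = n_distinct(sender_id),
                   receivers  = n_distinct(recipient_id))
}

summarise_season(s1)
summarise_season(s2)
```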

## Parsed with column specification:
## cols(
##   id = col_double(),
##   coord_id = col_double(),
##   recipient_id = col_double(),
##   sender_id = col_double(),
##   tokens = col_double(),
##   new_timestamp = col_datetime(format = "")
## )
## Parsed with column specification:
## cols(
##   id = col_double(),
##   coord_id = col_double(),
##   recipient_id = col_double(),
##   sender_id = col_double(),
##   tokens = col_double(),
##   new_timestamp = col_datetime(format = "")
## )
## # A tibble: 1 x 1
##   sum_tokens
##        <dbl>
## 1     192076
## # A tibble: 1 x 1
##   sum_tokens
##        <dbl>
## 1     425561
## # A tibble: 302 x 2
##    sender_id     n
##        <dbl> <int>
##  1      2225   286
##  2      2202   278
##  3      2075   274
##  4      2178   271
##  5      2167   211
##  6      2240   128
##  7      2268   120
##  8      2056   112
##  9      2173   102
## 10      2267   100
## # … with 292 more rows
## # A tibble: 1 x 1
##       n
##   <int>
## 1   302
## # A tibble: 1 x 1
##       n
##   <int>
## 1   458
## # A tibble: 1 x 1
##       n
##   <int>
## 1   416
## # A tibble: 1 x 1
##       n
##   <int>
## 1   481

Tips Given & Received

Data Source:

  • Pipeline: McKethanor built this pipeline to our postgres db.
  • DaoDash table: tipping has most up-to-date info.
  • DaoDash saved query: Seasonal Tip Distribution

Season 1 Tip Distribution (2873 rows; 983,019 tips given), WHERE timestamp < '2021-10-08'::DATE AND timestamp > '2021-06-14'::DATE

Season 2 Tip Distribution (1336 rows; 337,173 tips given), WHERE timestamp > '2021-10-07'::DATE AND timestamp < '2022-01-08'::DATE

Procedure Summary: Get comparison between S1 and S2, create tibble/dataframe to display data. Ultimately create own dataframe.

Description of Finding: Tipping has dropped by about 66% (983,019 → 337,173 tips given).
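The drop follows directly from the totals above:

```r
s1_tips <- 983019
s2_tips <- 337173

# Percent change from Season 1 to Season 2
round((s2_tips - s1_tips) / s1_tips * 100)  # about -66
```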

Number of Bank Holders (Membership Categories): S1 vs S2

Data Source: Due to inconsistency between queries in Dune Analytics, DAO Dash, and even Google BigQuery, more time needs to be spent reconciling differences between seemingly identical queries.

In the short-term, use Dune Analytics.

Procedure Summary:

Description of Findings:

Task: Compare S1 vs S2 on all membership categories.

See mem_1_10K_bank_holdings_time_range and other segments downloaded in raw csv. Data gathered from DAODash table.

See also:

  • BANK Holders 10-35K
  • BANK Holders 35-150K
  • BANK Holders 150K-1.5M

## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Parsed with column specification:
## cols(
##   day = col_datetime(format = ""),
##   address = col_character(),
##   sum_balance = col_double()
## )
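Given the parsed columns above (day, address, sum_balance) and the lubridate/zoo packages attached in the output, one plausible summarization of a holder segment looks like this. The file name is taken from the text above; the 7-day smoothing window is an assumption:

```r
library(tidyverse)
library(lubridate)
library(zoo)

holders <- read_csv("raw_data/mem_1_10K_bank_holdings_time_range.csv")

daily <- holders %>%
  group_by(day = as_date(day)) %>%
  summarise(n_holders = n_distinct(address), .groups = "drop") %>%
  arrange(day) %>%
  # Smooth the daily holder count with a trailing 7-day rolling mean
  mutate(n_holders_7d = rollmean(n_holders, k = 7, fill = NA, align = "right"))
```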

Discourse

Data Source:

Procedure Summary:

Description of Findings:

Task: Connect to API to query forum posts with polls and number of people voting.

Challenge: API documentation does not mention Polls or Votes. See documentation here.

Progress: Established an API connection with the pydiscourse library: https://github.com/BanklessDAO/analytics/blob/main/discourse_forum/api_connection.py

Ended up downloading a CSV of all users manually. Instead of votes, we will track other engagement metrics like:

  • topics_entered
  • posts_read_count
  • time_read
  • topic_count

NOTE: Remove the email column, then save the raw data to GitHub.

First Quest

Task: Query number of new members going through First Quest

Challenge: Data was first inserted on 2021-11-18, so there might be insufficient data to make an S1 vs S2 comparison.

Just show the S2 First Quest data standalone (comparing funnel shape from two time periods).

Next Section
